add security note about accessing urls #1600
This doesn't mention any security considerations. It's a requirement that we made for security reasons, but it's not a security consideration itself.
We could talk about the security considerations that led to that decision, but that feels out-of-place to me. This section should be about things implementers need to consider and protect against. It's not supposed to be a place for us to justify decisions we made for security reasons.
Because this requirement is a "SHOULD" and not a "MUST", we could talk about the security considerations that implementers who choose to support that kind of retrieval need to be aware of. That's the only way I think this makes sense.
specs/jsonschema-core.md
Outdated
the host system to various security vulnerabilities, such as man-in-the-middle
attacks or data leaks.
I don't want to sound alarmist, but RCEs are also a potential if there's the potential of bad parsing and malicious intent. I think MitM is a low risk, but a notable consideration.
How do you imagine data leaks might happen? By virtue of making a request to a URL from a system which should be invisible?
A misbehaving implementation with access to the internet could send your data to another server, unrequested. To avoid this we instruct implementations to not make network calls by default. Thus making use of the network is opt-in, suggesting that the user understands the risks.
I can add the RCE risk to the list.
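To make the opt-in idea concrete, here is a minimal sketch of a registry that never touches the network unless the caller enables it. `SchemaRegistry`, `RetrievalNotAllowed`, `allow_network`, and `fetch` are hypothetical names made up for illustration; they are not taken from the spec or from any existing implementation.

```python
from urllib.parse import urlsplit


class RetrievalNotAllowed(Exception):
    """Raised when resolving an IRI would need I/O that was not enabled."""


class SchemaRegistry:
    """Hypothetical registry: serves schemas from memory, network is opt-in."""

    def __init__(self, allow_network=False, fetch=None):
        self._schemas = {}
        self._allow_network = allow_network
        # `fetch` is a caller-supplied callable (IRI -> schema dict).
        # The registry itself never opens a connection.
        self._fetch = fetch

    def add(self, iri, schema):
        """Register a schema the caller already has and trusts."""
        self._schemas[iri] = schema

    def resolve(self, iri):
        # Known schemas are always served from memory.
        if iri in self._schemas:
            return self._schemas[iri]

        scheme = urlsplit(iri).scheme
        network_ok = self._allow_network and self._fetch is not None
        if scheme in ("http", "https") and network_ok:
            schema = self._fetch(iri)
            self._schemas[iri] = schema  # cache so each IRI is fetched once
            return schema

        # Default path: treat the IRI as an identifier only, not a location.
        raise RetrievalNotAllowed(
            f"refusing to retrieve {iri!r}: not registered and retrieval not enabled"
        )
```

With the defaults, `SchemaRegistry().resolve("https://example.com/unknown")` raises instead of making a request, so any network access has to be an explicit decision by whoever constructs the registry.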
I'm sure there are nuances that I'm not familiar with in this area, but I don't see any of these things as risks worth mentioning.
> A misbehaving implementation with access to the internet could send your data to another server, unrequested
I don't see how that's possible. We're talking about retrieving schemas over a network. Information is coming into the system, never out. The only data that could be leaked is what public schemas your network is accessing.
> I think MitM is a low risk, but a notable consideration.
I see MitM as essentially the same thing as data leakage. MitM is about covertly intercepting communications that are thought to be done privately. If you're retrieving a publicly available schema there's no need for MitM because the schema is already public. Again, the only information that could be exposed is which schemas you're accessing.
> RCEs are also a potential if there's the potential of bad parsing and malicious intent.
I'm not sure what you mean by this. It would need to be code sent by the attacker that gets executed by the implementation that isn't intended to be executed by the implementation. I don't see how that's possible.
Minor issue, but otherwise looks good. Thanks!
Co-authored-by: Ben Hutton <[email protected]>
Something that occurs to me is that the real risk is evaluating untrusted schemas. Retrieving a schema you control that is accessible only on a VPN is a safe practice. The risk only comes when on an untrusted network because it opens the possibility that an untrusted schema can get into your system.
I think this section should focus on specific risks of network/filesystem access related to untrusted schemas. For example, if a system accepts user schemas and one of those schemas has a filesystem reference, you don't want an untrusted schema trying to access your filesystem.
Once we've covered that, then we can simply say that accessing schemas over an untrusted network opens the possibility of unintentionally evaluating untrusted schemas due to malicious actors. I wouldn't even mention specific types of network attacks. I think that's out of scope.
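As a rough illustration of the user-submitted schema case, a system could scan an untrusted schema for references that point at the filesystem or network before evaluating it. This is only a sketch: `collect_refs`, `unsafe_refs`, and `BLOCKED_SCHEMES` are hypothetical names, the walk is deliberately naive (it also flags a `$ref` string that appears inside `const` or `enum` data), and it does not catch relative references that resolve against a `file:` base IRI.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: these schemes could trigger I/O if dereferenced.
BLOCKED_SCHEMES = {"file", "http", "https"}


def collect_refs(schema):
    """Yield every string-valued `$ref` found anywhere in a schema document."""
    if isinstance(schema, dict):
        for key, value in schema.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from collect_refs(value)
    elif isinstance(schema, list):
        for item in schema:
            yield from collect_refs(item)


def unsafe_refs(untrusted_schema):
    """Return the references that name a filesystem or network location."""
    return [
        ref
        for ref in collect_refs(untrusted_schema)
        if urlsplit(ref).scheme.lower() in BLOCKED_SCHEMES
    ]


user_schema = {
    "properties": {
        "a": {"$ref": "file:///etc/passwd"},
        "b": {"$ref": "#/$defs/x"},
    },
    "allOf": [{"$ref": "https://example.com/s"}],
}

# The local reference is left alone; the other two are flagged.
flagged = unsafe_refs(user_schema)
```

A system accepting user schemas could reject the schema outright when `flagged` is non-empty, or resolve only against schemas it has registered itself.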
@@ -1990,6 +1990,13 @@ A malicious schema author could place executable code or other dangerous
material within a `$comment`. Implementations MUST NOT parse or otherwise take
action based on `$comment` contents.

When encountering an IRI that also represents a valid file system or network
location, implementations are discouraged from automatically making an operation to
access that location. Schema authors should take care when configuring
The audience of the spec isn't schema authors and we shouldn't be speaking directly to them. We've moved away from that in other places and I think we should stick to that as a policy.
We can say the same thing this is saying, but from the perspective of the implementation.
I think this is probably the one place where it should be okay to address the schema author. They should know the risks of using an implementation.
FWIW this is what I note for security considerations in my implementation -- https://metacpan.org/pod/JSON::Schema::Modern#SECURITY-CONSIDERATIONS -- as regular expressions provide a potential vector for executing code or creating a DoS.
@karenetheridge thank you. I notice that what you have is particularly focused on regular expressions, which are already included in the validation spec.
Yes, since I don't support fetching schemas from disk or the network, I think this is the only direct source of vulnerabilities that a user might not already be aware of. I think the key to emphasize (and we can repeat it in a few places if relevant) is "do not trust schemas from external sources".
What kind of change does this PR introduce?
clarification
Issue & Discussion References
Summary
Adds a security note about performing network operations when encountering URLs.
The last sentence in the addition was taken directly from @awwright's comment in the issue.
Does this PR introduce a breaking change?
no